POLYMORPHISMS IN THE HUMAN ALPHA4 INTEGRIN SUBUNIT GENE, SUITABLE FOR DIAGNOSIS AND
TREATMENT OF INTEGRIN LIGAND MEDIATED DISEASES 

This invention relates to polymorphisms in the human α4 integrin subunit gene. The
invention also relates to methods and materials for analysing allelic variation in the α4 integrin
subunit gene, and to the use of α4 integrin subunit polymorphism in the diagnosis and
treatment of integrin ligand mediated diseases such as multiple sclerosis, rheumatoid arthritis,
atherosclerosis and allergic asthma.

The integrins are a family of heterodimeric cell surface receptors that are composed of
noncovalently associated glycoprotein subunits (α and β) and are involved in the adhesion of
cells to other cells or to extracellular matrix. The interactions between integrins and their
protein ligands are fundamental for maintaining cell function, for example by tethering cells at a
particular location, facilitating cell migration, or providing survival signals to cells from their
environment. Ligands recognised by integrins include extracellular matrix proteins, such as
collagen and fibronectin; plasma proteins, such as fibrinogen; and cell surface molecules, such
as transmembrane proteins of the immunoglobulin superfamily and cell-bound complement.
There are at least 14 different human integrin α subunits and at least 8 different β subunits and
each β subunit can form a heterodimer with one or more α subunits. The specificity of the
interaction between integrin and ligand is governed by the α and β subunit composition.

The α4 integrin subunit comprises 999 amino acids and is formed from a 1038 amino
acid precursor by the cleavage of a 39 amino acid N-terminal signal peptide. The core protein
molecular weight is 111 kDa. There are 11 N-glycosylation sites in the extracellular region
and the protein expressed on the cell surface usually has a molecular weight of 145 kDa,
although it can also exist as a 180 kDa isoform. The 145 kDa form can be partially cleaved
into 80 and 70 kDa fragments. The extracellular domain comprises amino acid residues 1-944,
the transmembrane domain residues 945-967 and there is a short intracellular domain
comprising residues 968-999. The N-terminal 432 amino acids contain seven sequence repeats
which are thought to fold into a seven-bladed β-propeller. Ligands and a putative magnesium
ion are predicted to bind to the upper face of the β-propeller while there are three calcium
binding motifs on the lower face.

WO 00/17394 PCT/GB99/03071

The α4 subunit is known to form a heterodimer with either the β1 or β7 subunits. The
integrin α4β1, also known as Very Late Antigen-4 (VLA-4) or CD49d/CD29, is expressed on
numerous hematopoietic cells, including hematopoietic precursors, peripheral and cytotoxic T
lymphocytes, B lymphocytes, monocytes, thymocytes and eosinophils, and established cell
lines. α4β1 has two main ligands, Vascular Cell Adhesion Molecule-1 (VCAM-1), also known
as CD106, an immunoglobulin superfamily member expressed on the surface of activated
vascular endothelial cells and a variety of other cells including dendritic cells, macrophages and
fibroblasts, and an isoform of fibronectin containing the alternatively spliced type III
connecting segment (CS-1 fibronectin). α4β7 also recognises VCAM-1 and CS-1 fibronectin as
ligands but will preferentially bind to Mucosal Addressin Cell Adhesion Molecule-1
(MAdCAM-1), another immunoglobulin superfamily member expressed on vascular
endothelial cells, mainly in the small intestine and to a lesser extent the colon and spleen. α4β7
is expressed on lymphocytes that preferentially home to gastrointestinal mucosa and
gut-associated lymphoid tissue and may have a role in maintaining mucosal immunity.

The activation and extravasation of blood leukocytes play a major role in the
development and progression of inflammatory diseases. Cell adhesion to the vascular
endothelium is required before cells migrate from the blood into inflamed tissue and is
mediated by specific interactions between cell adhesion molecules on the surface of vascular
endothelial cells and circulating leukocytes. α4 integrins are believed to have an important role
in the recruitment of lymphocytes, monocytes and eosinophils during inflammation.

The affinity of leukocyte integrins for their ligands is normally low but activation of
leukocytes increases integrin affinity. At sites of inflammation, leukocyte integrins are thought
to be activated by chemokines which act via receptors on the leukocyte surface. Integrin
affinity is thought to be regulated by conformational changes in the integrin subunits induced
by intracellular signalling pathways acting on the integrin cytoplasmic tails.

Expression of α4 integrin ligands is upregulated at sites of inflammation. VCAM-1 and
MAdCAM-1 expression is upregulated on endothelial cells in vitro by inflammatory cytokines.
VCAM-1 expression is upregulated in human inflammatory diseases such as rheumatoid
arthritis, multiple sclerosis, allergic asthma and atherosclerosis while CS-1 fibronectin
expression is upregulated in rheumatoid arthritis. MAdCAM-1 expression is upregulated in
murine models of inflammatory bowel disease and insulin-dependent diabetes.

Monoclonal antibodies directed against the α4 integrin subunit have been shown to be
effective in a number of animal models of human inflammatory diseases including multiple
sclerosis, rheumatoid arthritis, allergic asthma, contact dermatitis, transplant rejection,
insulin-dependent diabetes, inflammatory bowel disease, and glomerulonephritis.

α4β1/ligand binding has also been implicated in T-cell proliferation, B-cell localisation
to germinal centres, haematopoietic progenitor cell localisation in the bone marrow,
angiogenesis, placental development, muscle development and tumour cell metastasis.

Integrins recognise short peptide motifs in their ligands. The minimal α4 integrin
binding epitope in CS-1 is the tripeptide leucine-aspartic acid-valine (LDV) while VCAM-1
contains the similar sequence isoleucine-aspartic acid-serine (IDS). α4β7 binds to a
leucine-aspartic acid-threonine (LDT) motif in MAdCAM-1. Small molecule inhibitors of ligand
binding to α4 integrins have been designed based on these short peptide motifs. α4 integrin
antagonists, monoclonal antibodies directed at α4 integrins or their ligands and inhibitors of α4
integrin ligand expression may have utility in the treatment of autoimmune, allergic and
vascular inflammatory diseases, the prevention of tumour metastasis and in mobilisation of
haematopoietic progenitor cells from bone marrow prior to tumour chemotherapy.

A cDNA encoding the α4 integrin subunit has been cloned and published as EMBL
Accession number: L12002 (3567 bp). Promoter sequence has been published as EMBL
Accession numbers L26059 and M62841. All positions herein relate to the position therein
unless stated otherwise or apparent from the context.

Szabo and McIntyre (1995), Molecular Immunology 32, 1543-54, disclosed a SNP in
human integrin α4 subunit at position 3061, which produces a Gln to Arg change in the
subunit.

One approach is to use knowledge of polymorphisms to help identify patients most
suited to therapy with particular pharmaceutical agents (this is often termed
"pharmacogenetics"). Pharmacogenetics can also be used in pharmaceutical research to assist
the drug selection process. Polymorphisms are used in mapping the human genome and to
elucidate the genetic component of diseases. The reader is directed to the following references
for background details on pharmacogenetics and other uses of polymorphism detection:
Linder et al. (1997), Clinical Chemistry, 43, 254; Marshall (1997), Nature Biotechnology, 15,
1249; International Patent Application WO 97/40462, Spectra Biomedical; and Schafer et al.
(1998), Nature Biotechnology, 16, 33.

Clinical trials have shown that patient response to treatment with pharmaceuticals is
often heterogeneous. Thus there is a need for improved approaches to pharmaceutical agent
design and therapy.

Variations in polypeptide sequence will be referred to as follows: original amino acid
(using 1 or 3 letter nomenclature), position, new amino acid. For (a hypothetical) example
"D25K" or "Asp25Lys" means that at position 25 an aspartic acid (D) has been changed to
lysine (K). Multiple mutations in one polypeptide will be shown between square brackets with
individual mutations separated by commas.
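
The nomenclature just described can be read mechanically. The following is a minimal Python sketch, offered only as an illustration and not as part of the disclosed methods; the function names parse_variation and parse_multiple are hypothetical, and the three-letter table is the standard set of amino acid abbreviations.

```python
import re

# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = set(THREE_TO_ONE.values())


def parse_variation(text):
    """Parse 'D25K' or 'Asp25Lys' into (original, position, new), one-letter codes."""
    text = text.strip()
    # Three-letter form, e.g. Asp25Lys
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", text)
    if match:
        original, position, new = match.groups()
        if original in THREE_TO_ONE and new in THREE_TO_ONE:
            return THREE_TO_ONE[original], int(position), THREE_TO_ONE[new]
    # One-letter form, e.g. D25K
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", text)
    if match:
        original, position, new = match.groups()
        if original in ONE_LETTER and new in ONE_LETTER:
            return original, int(position), new
    raise ValueError(f"not a recognised variation: {text!r}")


def parse_multiple(text):
    """Parse a bracketed, comma-separated list such as '[D25K, Asp30Lys]'."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("multiple mutations must be enclosed in square brackets")
    return [parse_variation(item) for item in text[1:-1].split(",")]
```

For the hypothetical example in the text, both "D25K" and "Asp25Lys" parse to the same triple of aspartic acid, position 25 and lysine.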

The present invention is based on the discovery of five single nucleotide polymorphisms
(SNPs) in the coding region of the human α4 integrin subunit gene and eight in the promoter
region.

According to one aspect of the present invention there is provided a method for the
diagnosis of a single nucleotide polymorphism in an α4 integrin subunit in a human, which
method comprises determining the sequence of the nucleic acid of the human at one or more of
the following positions:

positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002;
position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and
positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841;
and determining the status of the human by reference to polymorphism in the α4 integrin
subunit gene.

According to another aspect of the present invention there is provided a method for the
diagnosis of a single nucleotide polymorphism in an α4 integrin subunit in a human, which
method comprises determining the sequence of the nucleic acid of the human at one or more of
positions 740, 2273, 2446, 3311 and 3506 in the α4 integrin subunit gene as defined by the
positions in EMBL ACCESSION NO. L12002, and determining the status of the human by
reference to polymorphism in the α4 integrin subunit gene.

The term human includes both a human having or suspected of having an α4 integrin
subunit ligand mediated disease and an asymptomatic human who may be tested for
predisposition or susceptibility to such disease. At each position the human may be
homozygous for an allele or the human may be a heterozygote.

The polymorphisms identified in the present invention which occur in the promoter
region are not expected to alter any amino acid sequence, but several of the polymorphisms
affect transcription sites within the promoter region and thus may affect the transcription of the
gene. The reader is referred to Example 3 below.

Assays, for example reporter-based assays, may be devised to detect whether one or 
more of the above polymorphisms affect transcription levels and/or message stability. 

Individuals who carry particular allelic variants of the gene may therefore exhibit
differences in their ability to regulate protein biosynthesis under different physiological
conditions and will display altered abilities to react to different diseases. In addition,
differences in protein regulation arising as a result of allelic variation may have a direct effect
on the response of an individual to drug therapy. The diagnostic methods of the invention may
be useful both to predict the clinical response to such agents and to determine therapeutic
dose.

In one embodiment of the invention preferably the method for diagnosis described 
herein is one in which the single nucleotide polymorphism at position 740 is presence of C 
and/or T. 

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 2273 is presence of A
and/or G.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 2446 is presence of C
and/or T. Testing for the presence of the T allele at this position is especially preferred
because, without wishing to be bound by theoretical considerations, of its association with a
significant amino acid change in the polypeptide sequence of α4 integrin.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 3311 is presence of T
and/or C.

In another embodiment of the invention preferably the method for diagnosis described 
herein is one in which the single nucleotide polymorphism at position 3506 is presence of C 
and/or T. 

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 967 is presence of G
and/or A.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 184 is presence of A
and/or G.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 238 is presence of C
and/or T.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 331 is presence of C
and/or T.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 436 is presence of C
and/or T.

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 676 is presence of C
and/or T.

In another embodiment of the invention preferably the method for diagnosis described 
herein is one in which the single nucleotide polymorphism at position 1010 is presence of C 
and/or A. 

In another embodiment of the invention preferably the method for diagnosis described
herein is one in which the single nucleotide polymorphism at position 1115 is presence of C
and/or T.

The method for diagnosis is preferably one in which the sequence is determined by a 
method selected from amplification refractory mutation system, minisequencing and restriction 
fragment length polymorphism. 

In another aspect of the invention we provide a method for the diagnosis of α4 integrin
subunit ligand-mediated disease, which method comprises:

i) obtaining sample nucleic acid from an individual,

ii) detecting the presence or absence of a variant nucleotide at one or more of positions:
positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002,
position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and
positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841; and

iii) determining the status of the individual by reference to polymorphism in the α4 integrin
subunit gene.

Preferred variations are as follows. Allelic variation at position 740 consists of a single
base substitution from C (the published base), preferably to T. Allelic variation at position
2273 consists of a single base substitution from A (the published base), preferably to G. Allelic
variation at position 2446 consists of a single base substitution from C (the published base),
preferably to T. Allelic variation at position 3311 consists of a single base substitution from T
(the published base), preferably to C. Allelic variation at position 3506 consists of a single
base substitution from C (the published base), preferably to T. Allelic variation at position
967 consists of a single base substitution from G (the published base), preferably to A. Allelic
variation at position 184 consists of a single base substitution from A (the published base),
preferably to G. Allelic variation at position 238 consists of a single base substitution from C
(the published base), preferably to T. Allelic variation at position 331 consists of a single base
substitution from C (the published base), preferably to T. Allelic variation at position 436
consists of a single base substitution from C (the published base), preferably to T. Allelic
variation at position 676 consists of a single base substitution from C (the published base),
preferably to T. Allelic variation at position 1010 consists of a single base substitution from C
(the published base), preferably to A. Allelic variation at position 1115 consists of a single
base substitution from C (the published base), preferably to T.
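
By way of illustration only, the preferred variations listed above can be held in a small lookup table and used to classify the two alleles observed in an individual as homozygous or heterozygous. The sketch below is in Python; the region labels "coding" and "promoter", the name VARIANTS and the function call_genotype are assumptions made for this example and are not terms used elsewhere in this specification.

```python
# (region, position) -> (published base, preferred variant base), as listed above.
# "coding" positions are numbered as in the cDNA accession; "promoter" positions
# are numbered as in the respective promoter accessions.
VARIANTS = {
    ("coding", 740): ("C", "T"),
    ("coding", 2273): ("A", "G"),
    ("coding", 2446): ("C", "T"),
    ("coding", 3311): ("T", "C"),
    ("coding", 3506): ("C", "T"),
    ("promoter", 967): ("G", "A"),
    ("promoter", 184): ("A", "G"),
    ("promoter", 238): ("C", "T"),
    ("promoter", 331): ("C", "T"),
    ("promoter", 436): ("C", "T"),
    ("promoter", 676): ("C", "T"),
    ("promoter", 1010): ("C", "A"),
    ("promoter", 1115): ("C", "T"),
}


def call_genotype(region, position, alleles):
    """Classify the two alleles observed at one polymorphic position."""
    published, variant = VARIANTS[(region, position)]
    observed = [base.upper() for base in alleles]
    if len(observed) != 2:
        raise ValueError("exactly two alleles are expected for a diploid sample")
    if any(base not in (published, variant) for base in observed):
        raise ValueError("allele is neither the published nor the preferred variant base")
    if observed[0] != observed[1]:
        return "heterozygous"
    if observed[0] == published:
        return "homozygous published"
    return "homozygous variant"
```

For example, call_genotype("coding", 2446, "CT") reports a heterozygote at position 2446, while call_genotype("promoter", 1010, "AA") reports an individual homozygous for the variant base.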

The status of the individual may be determined by reference to allelic variation at any 
one or more positions optionally in combination with any other polymorphism in the gene that 
is (or becomes) known. 

The test sample of nucleic acid is conveniently a sample of blood, bronchoalveolar
lavage fluid, sputum, or other body fluid or tissue obtained from an individual. It will be
appreciated that the test sample may equally be a nucleic acid sequence corresponding to the
sequence in the test sample, that is to say that all or a part of the region in the sample nucleic
acid may firstly be amplified using any convenient technique e.g. PCR, before analysis of allelic
variation.

It will be apparent to the person skilled in the art that there are a large number of
analytical procedures which may be used to detect the presence or absence of variant
nucleotides at one or more polymorphic positions of the invention. In general, the detection of
allelic variation requires a mutation discrimination technique, optionally an amplification
reaction and optionally a signal generation system. Table 1 lists a number of mutation
detection techniques, some based on the PCR. These may be used in combination with a
number of signal generation systems, a selection of which is listed in Table 2. Further
amplification techniques are listed in Table 3. Many current methods for the detection of
allelic variation are reviewed by Nollau et al., Clin. Chem. 43, 1114-1120, 1997; and in
standard textbooks, for example "Laboratory Protocols for Mutation Detection", Ed. by U.
Landegren, Oxford University Press, 1996 and "PCR", 2nd Edition by Newton & Graham,
BIOS Scientific Publishers Limited, 1997.

Abbreviations:

ALEX™         Amplification refractory mutation system linear extension
APEX          Arrayed primer extension
ARMS™         Amplification refractory mutation system
b-DNA         Branched DNA
CMC           Chemical mismatch cleavage
bp            base pair
COPS          Competitive oligonucleotide priming system
DGGE          Denaturing gradient gel electrophoresis
FRET          Fluorescence resonance energy transfer
LCR           Ligase chain reaction
MAdCAM-1      mucosal addressin cell adhesion molecule-1
MALDITOF-MS   matrix assisted laser desorption ionisation time of flight mass spectrometry
MASDA         Multiple allele specific diagnostic assay
NASBA         Nucleic acid sequence based amplification
OLA           Oligonucleotide ligation assay
PCR           Polymerase chain reaction
PTT           Protein truncation test
RFLP          Restriction fragment length polymorphism
SDA           Strand displacement amplification
SNP           Single nucleotide polymorphism
SSCP          Single-strand conformation polymorphism analysis
SSR           Self sustained replication
TGGE          Temperature gradient gel electrophoresis
VCAM-1        Vascular Cell Adhesion Molecule-1
VLA-4         Very Late Antigen-4

Table 1 - Mutation Detection Techniques

General: DNA sequencing, Sequencing by hybridisation
Scanning: PTT*, SSCP, DGGE, TGGE, Cleavase, Heteroduplex analysis, CMC, Enzymatic
mismatch cleavage
* Note: not useful for detection of promoter polymorphisms.
Hybridisation Based:
Solid phase hybridisation: Dot blots, MASDA, Reverse dot blots, Oligonucleotide
arrays (DNA Chips)
Solution phase hybridisation: Taqman™ - US-5210015 & US-5487972 (Hoffmann-La
Roche), Molecular Beacons - Tyagi et al (1996), Nature Biotechnology, 14, 303; WO
95/13399 (Public Health Inst., New York)
Extension Based: ARMS™, ALEX™ - European Patent No. EP 332435 B1 (Zeneca
Limited), COPS - Gibbs et al (1989), Nucleic Acids Research, 17, 2347.
Incorporation Based: Mini-sequencing, APEX
Restriction Enzyme Based: RFLP, Restriction site generating PCR
Ligation Based: OLA
Other: Invader assay

Table 2 - Signal Generation or Detection Systems 

Fluorescence: FRET, Fluorescence quenching, Fluorescence polarisation - United Kingdom 
Patent No. 2228998 (Zeneca Limited) 

Other: Chemiluminescence, Electrochemiluminescence, Raman, Radioactivity, Colorimetric,
Hybridisation protection assay, Mass spectrometry e.g. MALDITOF-MS

Table 3 - Further Amplification Methods
SSR, NASBA, LCR, SDA, b-DNA

Preferred mutation detection techniques include ARMS™, ALEX™, COPS, Taqman,
Molecular Beacons, RFLP, and restriction site based PCR and FRET techniques.

Particularly preferred methods include ARMS™ and RFLP based methods. ARMS™ 
is an especially preferred method. 
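
As a hedged illustration of the allele-specific priming principle on which ARMS™ relies, the Python sketch below builds a pair of forward primers that are identical except for the 3' terminal base, which sits on the polymorphic position. The function name, the default primer length of 20 bases and any sequence passed to it are assumptions for this example; practical primer designs often add a further deliberate mismatch near the 3' end and are checked for melting temperature, neither of which is modelled here.

```python
def arms_forward_primers(sequence, position, published, variant, length=20):
    """Return (published-allele primer, variant-allele primer).

    Both primers end at the 1-based polymorphic `position`, so that only the
    primer whose 3' terminal base matches the template is expected to extend.
    """
    sequence = sequence.upper()
    published = published.upper()
    variant = variant.upper()
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[position - 1] != published:
        raise ValueError("sequence does not carry the published base at this position")
    start = position - length
    if start < 0:
        raise ValueError("not enough upstream sequence for a primer of this length")
    # Shared stem: the bases immediately 5' of the polymorphic position.
    stem = sequence[start:position - 1]
    return stem + published, stem + variant
```

The two primers would be used in separate reactions with a common reverse primer, so that the presence or absence of product in each reaction indicates which alleles are present.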

In a further aspect, the diagnostic methods of the invention are used to assess the
efficacy of therapeutic compounds in the treatment of α4 integrin subunit ligand mediated
diseases such as autoimmune, allergic and vascular inflammatory diseases.

Assays, for example reporter-based assays, may be devised to detect whether one or 
more of the above polymorphisms affect transcription levels and/or message stability. 

Individuals who carry particular allelic variants of the α4 integrin subunit gene may
therefore exhibit differences in their ability to regulate protein biosynthesis under different
physiological conditions and may display altered abilities to react to different diseases. In
addition, differences in protein regulation arising as a result of allelic variation may have a
direct effect on the response of an individual to drug therapy. The diagnostic methods of the
invention may be useful both to predict the clinical response to such agents and to determine
therapeutic dose.

In a further aspect, the diagnostic methods of the invention are used to assess the
predisposition and/or susceptibility of an individual to diseases mediated by α4 integrin subunit
ligands. This may be particularly relevant in the development of autoimmune, allergic and
vascular inflammatory diseases and other diseases which are modulated by α4 integrin subunit
interactions. The present invention may be used to recognise individuals who are particularly at
risk from developing these conditions.

In a further aspect, the diagnostic methods of the invention are used in the development
of new drug therapies which selectively target one or more allelic variants of the α4 integrin
subunit gene. Identification of a link between a particular allelic variant and predisposition to
disease development or response to drug therapy may have a significant impact on the design
of new drugs. Drugs may be designed to regulate the biological activity of variants implicated
in the disease process whilst minimising effects on other variants.

In a further diagnostic aspect of the invention the presence or absence of variant
nucleotides is detected by reference to the loss or gain of, optionally engineered, sites
recognised by restriction enzymes. In the accompanying Example 2 we provide details of
convenient engineered restriction enzyme sites that are lost or gained as a result of a
polymorphism of the invention.
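
The loss or gain of a restriction site caused by a single base substitution can be checked in silico. The following Python sketch is illustrative only: the function names are hypothetical, the recognition sequence must be supplied without degenerate bases, and only the given strand is scanned, which is sufficient for palindromic sites such as the EcoRI site GAATTC used in the test sequences.

```python
def site_positions(sequence, site):
    """Return the 0-based start index of every occurrence of `site`."""
    width = len(site)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == site]


def rflp_effect(sequence, position, variant, site):
    """Report restriction sites lost or gained when the base at the
    1-based `position` is replaced by `variant`."""
    sequence = sequence.upper()
    site = site.upper()
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    mutated = sequence[:position - 1] + variant.upper() + sequence[position:]
    before = set(site_positions(sequence, site))
    after = set(site_positions(mutated, site))
    return {"lost": sorted(before - after), "gained": sorted(after - before)}
```

A polymorphism that appears in the "lost" or "gained" list changes the fragment pattern after digestion and is therefore a candidate for RFLP analysis; one that appears in neither would need an engineered site.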

According to another aspect of the present invention there is provided a nucleic acid
comprising any one of the following polymorphisms:

the nucleic acid of EMBL ACCESSION No. L12002 with T at position 740 as defined by the
position in EMBL ACCESSION No. L12002;
the nucleic acid of EMBL ACCESSION No. L12002 with G at position 2273 as defined by the
position in EMBL ACCESSION No. L12002;
the nucleic acid of EMBL ACCESSION No. L12002 with T at position 2446 as defined by the
position in EMBL ACCESSION No. L12002;
the nucleic acid of EMBL ACCESSION No. L12002 with C at position 3311 as defined by the
position in EMBL ACCESSION No. L12002;
the nucleic acid of EMBL ACCESSION No. L12002 with T at position 3506 as defined by the
position in EMBL ACCESSION No. L12002;
the nucleic acid of EMBL ACCESSION No. L26059 with A at position 967 as defined by the
position in EMBL ACCESSION No. L26059;
the nucleic acid of EMBL ACCESSION No. M26841 with G at position 184 as defined by the
position in EMBL ACCESSION No. M26841;
the nucleic acid of EMBL ACCESSION No. M26841 with T at position 238 as defined by the
position in EMBL ACCESSION No. M26841;
the nucleic acid of EMBL ACCESSION No. M26841 with T at position 331 as defined by the
position in EMBL ACCESSION No. M26841;
the nucleic acid of EMBL ACCESSION No. M26841 with T at position 436 as defined by the
position in EMBL ACCESSION No. M26841;
the nucleic acid of EMBL ACCESSION No. M26841 with T at position 676 as defined by the
position in EMBL ACCESSION No. M26841;
the nucleic acid of EMBL ACCESSION No. M26841 with A at position 1010 as defined by
the position in EMBL ACCESSION No. M26841;
the nucleic acid of EMBL ACCESSION No. M26841 with T at position 1115 as defined by
the position in EMBL ACCESSION No. M26841;

or a complementary strand thereof or an antisense sequence for a coding region or a fragment
thereof of at least 20 bases comprising at least one polymorphism.

Fragments are at least 17 bases, more preferably at least 20 bases, more preferably at
least 30 bases. The nucleic acid of the invention does not encompass naturally occurring
nucleic acid as it occurs in nature, for example, the nucleic acid is at least partially purified
from at least one component with which it occurs naturally. Preferably the nucleic acid is at
least 30% pure, more preferably at least 60% pure, more preferably at least 90% pure, more
preferably at least 95% pure, and more preferably at least 99% pure.
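
As an illustrative sketch only, a fragment of a chosen length that carries a variant base, together with its complementary strand, can be derived from a reference sequence as follows. The function names and the default length of 20 bases are assumptions for this example; the reference sequence itself would be taken from the relevant accession and is not reproduced here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def fragment_around(sequence, position, variant, length=20):
    """Return a `length`-base fragment carrying `variant` at the 1-based
    `position`, roughly centred on that position where the sequence allows."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if length > len(sequence):
        raise ValueError("fragment is longer than the sequence")
    mutated = sequence[:position - 1] + variant + sequence[position:]
    # Centre the window on the variant, then clamp it to the sequence ends.
    start = max(0, position - 1 - length // 2)
    start = min(start, len(mutated) - length)
    return mutated[start:start + length]
```

The clamping step keeps the variant base inside the returned window even when the polymorphic position lies close to either end of the reference sequence.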

Novel sequence disclosed herein may be used in another embodiment of the invention
to regulate expression of the gene in cells by the use of antisense constructs. To enable
methods of down-regulating expression of the gene of the present invention in mammalian
cells, an example antisense expression construct can be readily constructed, for instance using
the pREP10 vector (Invitrogen Corporation). Transcripts are expected to inhibit translation
of the gene in cells transfected with this type of construct. Antisense transcripts are effective for
inhibiting translation of the native gene transcript, and capable of inducing the effects (e.g.,
regulation of tissue physiology) herein described. Oligonucleotides which are complementary
to and hybridizable with any portion of novel gene mRNA disclosed herein are contemplated
for therapeutic use. U.S. Patent No. 5,639,595, Identification of Novel Drugs and Reagents,
issued Jun. 17, 1997, wherein methods of identifying oligonucleotide sequences that display in
vivo activity are thoroughly described, is herein incorporated by reference. Expression vectors
containing random oligonucleotide sequences derived from previously known polynucleotides
are transformed into cells. The cells are then assayed for a phenotype resulting from the
desired activity of the oligonucleotide. Once cells with the desired phenotype have been
identified, the sequence of the oligonucleotide having the desired activity can be identified.
Identification may be accomplished by recovering the vector or by polymerase chain reaction
(PCR) amplification and sequencing the region containing the inserted nucleic acid
material. Nucleotide molecules can be synthesized for antisense therapy. These antisense
molecules may be DNA, stable derivatives of DNA such as phosphorothioates or
methylphosphonates, RNA, stable derivatives of RNA such as 2'-O-alkylRNA, or other
oligonucleotide mimetics. U.S. Patent No. 5,652,355, Hybrid Oligonucleotide
Phosphorothioates, issued July 29, 1997, and U.S. Patent No. 5,652,356, Inverted Chimeric
and Hybrid Oligonucleotides, issued July 29, 1997, which describe the synthesis and effect of
physiologically-stable antisense molecules, are incorporated by reference. Antisense molecules
may be introduced into cells by microinjection, liposome encapsulation or by expression from
vectors harboring the antisense sequence.

According to another aspect of the invention there is provided use of a nucleic acid
sequence comprising at least one of the polymorphisms in the promoter disclosed herein to
identify compounds that modify expression of the human α4 integrin subunit gene.
Modification of expression includes inhibition or enhancement of expression. This is
conveniently done by measuring expression levels of a reporter gene (for example
beta-galactosidase) under the control of the promoter in transfected host cells in the presence or
absence of test compounds. Suitable test compounds include polynucleotides capable of
binding to the promoter through triplex strand formation. Accordingly, suitable compounds
can be identified for therapeutic use which alter native gene expression either up or down as
appropriate for the relevant disease to be treated. The reader is directed to the following
references on nucleic acid triplex formation and uses: Progress in developments of
Triplex-Based strategies: Giovannangeli C; Helene C: Antisense and Nucleic Acid Drug Development
/ 7/4 (413-421) /1997; Recent developments in triple-helix regulation of gene expression:
Neidle S: Anti-Cancer Drug Design / 12/5 (433-442) /1997; Triplex DNA: Fundamentals,
advances, and potential applications for gene therapy: Chan PP; Glazer PM: Journal of
Molecular Medicine / 75/4 (267-282) /1997; Oligonucleotide directed triple helix
formation: Sun J-S; Garestier T; Helene C: Current Opinion in Structural Biology / 6/3
(327-333) /1996; C Mayfield, M Squibb, D Miller (1994) Inhibition of nuclear protein binding to the
human Ki-ras promoter by triplex-forming oligonucleotides Biochemistry 33, 3358-3363;
WM Olivas, LJ Maher (1996) Binding of DNA oligonucleotides to sequences in the promoter
of the human bcl-2 gene Nucleic Acids Research 24, 1758-1764; C Mayfield, S Ebinghaus, J
Gees, D Jones, B Rodu, M Squibb, D Miller (1994) Triplex formation by the human HA-ras
promoter inhibits Sp1 binding and in vitro transcription J Biol Chem 269, 18232-18238; and JE
Gee, GR Revankar, TS Rao, ME Hogan (1995) Triplex formation at the rat neu gene utilizing
imidazole and 2'-deoxy-6-thioguanosine base substitutions Biochemistry 34, 2042-2048.

According to another aspect of the present invention there is provided a computer
readable medium comprising at least one novel polynucleotide sequence of the invention
stored on the medium. The computer readable medium may be used, for example, in
homology searching, mapping, haplotyping, genotyping or pharmacogenetic analysis or any
other bioinformatic analysis. The reader is referred to Bioinformatics: A Practical Guide to the
Analysis of Genes and Proteins, edited by A D Baxevanis & B F F Ouellette, John Wiley &
Sons, 1998. Any computer readable medium may be used, for example, compact disk, tape,
floppy disk, hard drive or computer chips.

The polynucleotide sequences of the invention, or parts thereof, particularly those 
relating to and identifying the single nucleotide polymorphisms identified herein represent a 
valuable information source, for example, to characterise individuals in terms of haplotype and 
other sub-groupings, such as investigation of susceptibility to treatment with particular drugs.
These approaches are most easily facilitated by storing the sequence information in a computer 
readable medium and then using the information in standard bioinformatics programs or to 
search sequence databases using state of the art searching tools such as "GCG". Thus, the
polynucleotide sequences of the invention are particularly useful as components in databases
useful for sequence identity and other search analyses. As used herein, storage of the sequence
information in a computer readable medium and use in sequence databases in relation to
'polynucleotide or polynucleotide sequence of the invention' covers any detectable chemical
or physical characteristic of a polynucleotide of the invention that may be reduced to,
converted into or stored in a tangible medium, such as a computer disk, preferably in a
computer readable form. Examples include chromatographic scan data or peak data, photographic
scan or peak data, mass spectrographic data, and sequence gel (or other) data.

The invention provides a computer readable medium having stored thereon one or
more polynucleotide sequences of the invention. For example, a computer readable medium is
provided comprising and having stored thereon a member selected from the group consisting
of: a polynucleotide comprising the sequence of a polynucleotide of the invention, a
polynucleotide consisting of a polynucleotide of the invention, a polynucleotide which
comprises part of a polynucleotide of the invention, which part includes at least one of the
polymorphisms of the invention, a set of polynucleotide sequences wherein the set includes at
least one polynucleotide sequence of the invention, a data set comprising or consisting of a
polynucleotide sequence of the invention or a part thereof comprising at least one of the
polymorphisms identified herein. A computer based method is also provided for performing
sequence identification, said method comprising the steps of providing a polynucleotide
sequence comprising a polymorphism of the invention in a computer readable medium; and
comparing said polymorphism containing polynucleotide sequence to at least one other
polynucleotide or polypeptide sequence to identify identity (homology), i.e. screen for the
presence of a polymorphism.
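
The comparison step of such a computer based method can be sketched as follows. This is a minimal illustration only: the two 12-base sequences and the function name `find_differences` are hypothetical, are not taken from any EMBL entry cited herein, and the sequences are assumed to be already aligned.

```python
def find_differences(stored: str, other: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, stored base, other base) for each mismatch."""
    if len(stored) != len(other):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(stored.upper(), other.upper()))
        if a != b
    ]


# Hypothetical sequences differing at a single position (C versus T).
stored_seq = "ACGTTCGATGCA"
query_seq = "ACGTTTGATGCA"
print(find_differences(stored_seq, query_seq))
```

In practice an alignment or database search tool would be used in place of this position-by-position comparison.
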

The invention further provides nucleotide primers which can detect the polymorphisms
of the invention.

According to another aspect of the present invention there is provided an allele specific
primer capable of detecting an α4 integrin subunit gene polymorphism at one or more of
positions:

positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002;

position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and

positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841.

An allele specific primer is used, generally together with a constant primer, in an
amplification reaction such as a PCR reaction, which provides the discrimination between
alleles through selective amplification of one allele at a particular sequence position, e.g. as
used for ARMS™ assays. The allele specific primer is preferably 17-50 nucleotides, more
preferably about 17-35 nucleotides, more preferably about 17-30 nucleotides.

An allele specific primer preferably corresponds exactly with the allele to be detected
but derivatives thereof are also contemplated wherein about 6-8 of the nucleotides at the 3'
terminus correspond with the allele to be detected and wherein up to 10, such as up to 8, 6, 4,
2, or 1 of the remaining nucleotides may be varied without significantly affecting the properties
of the primer.
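
By way of illustration, a pair of allele specific primers can be derived from a sense-strand sequence as sketched below. The sketch assumes a forward primer whose 3' terminal base lies on the polymorphic position, as is typical of ARMS-type assays; the template sequence, the position and the function name `allele_specific_primers` are hypothetical and are not taken from the sequences of the invention.

```python
def allele_specific_primers(sense: str, snp_pos: int, alleles: tuple[str, str],
                            length: int = 20) -> dict[str, str]:
    """Build forward primers whose 3' terminal base is the polymorphic base.

    sense   -- sense-strand sequence (hypothetical in this sketch)
    snp_pos -- 1-based position of the polymorphism within `sense`
    length  -- primer length, kept within the preferred 17-50 nucleotides
    """
    if not 17 <= length <= 50:
        raise ValueError("length outside the preferred 17-50 nucleotide range")
    start = snp_pos - length  # 0-based index of the primer's first base
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = sense[start:snp_pos - 1]  # the length-1 bases before the SNP
    return {allele: upstream + allele for allele in alleles}


# Hypothetical template with a C/T polymorphism at position 25.
template = "GATTACAGGCTTACGATCCGTAGCTAGGCTA"
primers = allele_specific_primers(template, 25, ("C", "T"), length=20)
for allele, primer in primers.items():
    print(allele, primer)
```

The two primers differ only in their 3' terminal base, which is what allows selective amplification of one allele.
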

Primers may be manufactured using any convenient method of synthesis. Examples of
such methods may be found in standard textbooks, for example "Protocols for
Oligonucleotides and Analogues; Synthesis and Properties," Methods in Molecular Biology
Series; Volume 20; Ed. Sudhir Agrawal, Humana ISBN: 0-89603-247-7; 1993; 1st Edition. If
required the primer(s) may be labelled to facilitate detection.

According to another aspect of the present invention there is provided an allele-specific
oligonucleotide probe capable of detecting an α4 integrin subunit gene polymorphism at one or
more of positions:

positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002;

position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and

positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841.

The allele-specific oligonucleotide probe is preferably 17-50 nucleotides, more
preferably about 17-35 nucleotides, more preferably about 17-30 nucleotides.

The design of such probes will be apparent to the molecular biologist of ordinary skill.
Such probes are of any convenient length such as up to 50 bases, up to 40 bases, more
conveniently up to 30 bases in length, such as for example 8-25 or 8-15 bases in length. In
general such probes will comprise base sequences entirely complementary to the corresponding
wild type or variant locus in the gene. However, if required one or more mismatches may be
introduced, provided that the discriminatory power of the oligonucleotide probe is not unduly
affected. The probes of the invention may carry one or more labels to facilitate detection.

According to another aspect of the present invention there is provided a diagnostic kit
comprising an allele specific oligonucleotide probe of the invention and/or an allele-specific
primer of the invention.

The diagnostic kits may comprise appropriate packaging and instructions for use in the
methods of the invention. Such kits may further comprise appropriate buffer(s) and
polymerase(s) such as thermostable polymerases, for example Taq polymerase.

In another aspect of the invention, the single nucleotide polymorphisms of this
invention may be used as genetic markers in linkage studies. This particularly applies to the
polymorphisms at 2273 and/or 3311 and/or 1010 because of their relatively high frequency
(see below). The α4 integrin subunit gene has been mapped to chromosome 2q31-q32
(Fernandez-Ruiz et al, Europ. J. Immunol. 22: 587-590, 1992).
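
The relative frequencies referred to above can be compared numerically, as in the following minimal sketch. The counts are those reported in the tables of Examples 1 and 3; reading each count as "variant alleles observed / alleles scored" is an assumption made here for illustration, and the function name `minor_allele_frequency` is hypothetical.

```python
from fractions import Fraction

# Variant-allele frequencies as reported in Examples 1 and 3.
# Assumption: numerator = variant alleles seen, denominator = alleles scored.
reported = {
    "2273 (coding)": Fraction(32, 54),
    "3311 (coding)": Fraction(29, 54),
    "1010 (promoter)": Fraction(10, 60),
    "2446 (coding)": Fraction(1, 54),
}


def minor_allele_frequency(variant_freq: Fraction) -> Fraction:
    """Frequency of the rarer of the two alleles at a biallelic site."""
    return min(variant_freq, 1 - variant_freq)


for site, freq in reported.items():
    maf = minor_allele_frequency(freq)
    print(f"{site}: variant {float(freq):.3f}, minor allele {float(maf):.3f}")
```

Sites whose minor allele is common, such as 2273 and 3311, are informative in more individuals than a rare site such as 2446, which is why they are preferred as linkage markers.
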

Low frequency polymorphisms may be particularly useful for haplotyping as described
below. A haplotype is a set of alleles found at linked polymorphic sites (such as within a gene)
on a single (paternal or maternal) chromosome. If recombination within the gene is random,
there may be as many as 2^n haplotypes, where 2 is the number of alleles at each SNP and n is
the number of SNPs. One approach to identifying mutations or polymorphisms which are
correlated with clinical response is to carry out an association study using all the haplotypes
that can be identified in the population of interest. The frequency of each haplotype is limited
by the frequency of its rarest allele, so that SNPs with low frequency alleles are particularly
useful as markers of low frequency haplotypes. As particular mutations or polymorphisms
associated with certain clinical features, such as adverse or abnormal events, are likely to be of
low frequency within the population, low frequency SNPs may be particularly useful in
identifying these mutations (for examples see: Linkage disequilibrium at the cystathionine beta
synthase (CBS) locus and the association between genetic variation at the CBS locus and
plasma levels of homocysteine. Ann Hum Genet (1998) 62:481-90, De Stefano V, Dekou V,
Nicaud V, Chasse JF, London J, Stansbie D, Humphries SE, and Gudnason V; and Variation at
the von Willebrand factor (vWF) gene locus is associated with plasma vWF:Ag levels:
identification of three novel single nucleotide polymorphisms in the vWF gene promoter. Blood
(1999) 93:4277-83, Keightley AM, Lam YM, Brady JN, Cameron CL, Lillicrap D).
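
The haplotype count and the frequency limit described above can be illustrated with a short sketch. The three sites and their published/variant bases are taken from the table in Example 1; the function names are hypothetical, and the enumeration simply lists every combination that is possible in principle, not the haplotypes actually observed in any population.

```python
from itertools import product


def possible_haplotypes(sites: dict[int, tuple[str, str]]) -> list[tuple[str, ...]]:
    """Enumerate every allele combination across the given biallelic sites."""
    positions = sorted(sites)
    return list(product(*(sites[p] for p in positions)))


def haplotype_frequency_bound(allele_freqs: list[float]) -> float:
    """Upper bound on a haplotype's frequency: that of its rarest allele."""
    return min(allele_freqs)


# Published and variant bases at three coding-region positions (Example 1).
sites = {2273: ("A", "G"), 2446: ("C", "T"), 3311: ("T", "C")}
haplotypes = possible_haplotypes(sites)
print(len(haplotypes))  # 2**3 combinations for three biallelic sites

# A haplotype carrying the variant alleles at all three sites can be no more
# frequent than its rarest allele (here the 2446 variant, reported as 1/54).
print(haplotype_frequency_bound([32 / 54, 1 / 54, 29 / 54]))
```
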

According to another aspect of the present invention there is provided an allelic variant
of the human integrin α4 polypeptide having a methionine at position 679 or a fragment thereof
comprising at least 10 amino acids provided that the fragment comprises the allelic variant at
position 679.

Fragments of integrin α4 polypeptide are at least 10 amino acids, more preferably at
least 15 amino acids, more preferably at least 20 amino acids. The polypeptide of the invention
does not encompass naturally occurring polypeptide as it occurs in nature; for example, the
polypeptide is at least partially purified from at least one component with which it occurs
naturally. Preferably the polypeptide is at least 30% pure, more preferably at least 60% pure,
more preferably at least 90% pure, more preferably at least 95% pure, and more preferably at
least 99% pure.

According to another aspect of the present invention there is provided an antibody
specific for an allelic variant of human integrin α4 polypeptide having a methionine at position
679 or a fragment thereof comprising at least 10 amino acids provided that the fragment
comprises the allelic variant at position 679.

Antibodies can be prepared using any suitable method. For example, purified
polypeptide may be utilized to prepare specific antibodies. The term "antibodies" is meant to
include polyclonal antibodies, monoclonal antibodies, and the various types of antibody
constructs such as for example F(ab')2, Fab and single chain Fv. Antibodies are defined to be
specifically binding if they bind the T679M variant of integrin α4 with a Ka of greater than or
equal to about 10^7 M^-1. Affinity of binding can be determined using conventional techniques,
for example those described by Scatchard et al., Ann. N.Y. Acad. Sci., 51:660 (1949).

Polyclonal antibodies can be readily generated from a variety of sources, for example,
horses, cows, goats, sheep, dogs, chickens, rabbits, mice or rats, using procedures that are
well-known in the art. In general, antigen is administered to the host animal typically through
parenteral injection. The immunogenicity of antigen may be enhanced through the use of an
adjuvant, for example, Freund's complete or incomplete adjuvant. Following booster
immunizations, small samples of serum are collected and tested for reactivity to antigen.
Examples of various assays useful for such determination include those described in:
Antibodies: A Laboratory Manual, Harlow and Lane (eds.), Cold Spring Harbor Laboratory
Press, 1988; as well as procedures such as countercurrent immuno-electrophoresis (CIEP),
radioimmunoassay, radioimmunoprecipitation, enzyme-linked immuno-sorbent assays
(ELISA), dot blot assays, and sandwich assays, see U.S. Patent Nos. 4,376,110 and 4,486,530.

Monoclonal antibodies may be readily prepared using well-known procedures, see for
example, the procedures described in U.S. Patent Nos. RE 32,011, 4,902,614, 4,543,439 and
4,411,993; Monoclonal Antibodies, Hybridomas: A New Dimension in Biological Analyses,
Plenum Press, Kennett, McKearn, and Bechtol (eds.), (1980).

The monoclonal antibodies of the invention can be produced using alternative
techniques, such as those described by Alting-Mees et al., "Monoclonal Antibody Expression
Libraries: A Rapid Alternative to Hybridomas", Strategies in Molecular Biology 3: 1-9 (1990)
which is incorporated herein by reference. Similarly, binding partners can be constructed using
recombinant DNA techniques to incorporate the variable regions of a gene that encodes a
specific binding antibody. Such a technique is described in Larrick et al, Biotechnology, 7:
394 (1989).

Once isolated and purified, the antibodies may be used to detect the presence of antigen 
in a sample using established assay protocols. 

According to another aspect of the present invention there is provided a method of
treating a human in need of treatment with an α4 integrin subunit ligand antagonist drug in
which the method comprises:

i) diagnosis of a single nucleotide polymorphism in α4 integrin subunit gene in the
human, which diagnosis comprises determining the sequence of the nucleic acid at one or more
of positions:

positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002;

position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and

positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841;

and determining the status of the human by reference to polymorphism in the α4 integrin
subunit gene; and

ii) administering an effective amount of an α4 integrin subunit ligand antagonist.
Preferably determination of the status of the human is clinically useful. Examples of
clinical usefulness include deciding which antagonist drug or drugs to administer and/or
deciding on the effective amount of the drug or drugs.

α4 integrin subunit ligand antagonist drugs have been disclosed in the following
publications: international patent application WO 97/49731, Zeneca Limited; international
patent application WO 97/02289, Zeneca Limited; international patent application WO
96/20216, Zeneca Limited; US patent 5510332, Texas Biotechnology; international patent
application WO 96/01644, Athena Neurosciences; international patent application WO
96/01644, Athena Neurosciences; and international patent application WO 96/00581, Zeneca
Limited. An α4 integrin subunit ligand antagonist drug may act directly at α4 integrin subunit
heterodimer and/or at a ligand, such as VCAM, CS-1 fibronectin or MAdCAM-1, which binds
to α4 integrin subunit heterodimers, α4β1 or α4β7. VLA-4 antagonists as anti-inflammatory
agents have been reviewed by Lin KC & Castro AC in Curr. Opin. Chem. Biol. (1998), 2:
453-457.

According to another aspect of the present invention there is provided use of an α4
integrin subunit ligand antagonist drug in preparation of a medicament for treating an α4
integrin subunit ligand mediated disease in a human diagnosed as having a single nucleotide
polymorphism at one or more of positions:

positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002;

position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and

positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841.

According to another aspect of the present invention there is provided a pharmaceutical
pack comprising α4 integrin subunit ligand antagonist drug and instructions for administration
of the drug to humans diagnostically tested for a single nucleotide polymorphism at one or
more of positions:

positions 740, 2273, 2446, 3311 and 3506 in the coding region of α4 integrin subunit gene as
defined by the positions in EMBL ACCESSION NO. L12002;

position 967 in the promoter region of α4 integrin subunit gene as defined by the position in
EMBL ACCESSION NO. L26509; and

positions 184, 238, 331, 436, 676, 1010, or 1115 in the promoter region of α4 integrin subunit
gene as defined by the position in EMBL ACCESSION NO. M26841.

The invention will now be illustrated but not limited by reference to the following
Examples. All temperatures are in degrees Celsius.

In the Examples below, unless otherwise stated, the following methodology and 
materials have been applied. 

AMPLITAQ™, available from Perkin-Elmer Cetus, is used as the source of
thermostable DNA polymerase.

General molecular biology procedures can be followed from any of the methods 
described in "Molecular Cloning - A Laboratory Manual" Second Edition, Sambrook, Fritsch 
and Maniatis (Cold Spring Harbor Laboratory, 1989). 

Electropherograms were obtained in a standard manner; data was collected by ABI377
data collection software and the wave form generated by ABI Prism sequencing analysis
(2.1.2).

Example 1

Identification of Polymorphisms

1. Methods

cDNA Preparation

RNA was prepared from lymphoblastoid cell lines from Caucasian donors using
standard laboratory protocols (Chomczynski and Sacchi, Anal. Biochem. 162, 156-159, 1987)
and used to generate first strand cDNA (Gubler and Hoffman, Gene 25, 263-269, 1983).

Genomic DNA Preparation

DNA was prepared from frozen blood samples collected in EDTA following protocol I
(Molecular Cloning: A Laboratory Manual, p392, Sambrook, Fritsch and Maniatis, 2nd Edition,
Cold Spring Harbor Press, 1989) with the following modifications. The thawed blood was
diluted in an equal volume of standard saline citrate instead of phosphate buffered saline to
remove lysed red blood cells. Samples were extracted with phenol, then phenol/chloroform
and then chloroform rather than with three phenol extractions. The DNA was dissolved in
deionised water.

Template Preparation

Templates were prepared by PCR using the oligonucleotide primers and annealing
temperatures set out below. The extension temperature was 72° and denaturation temperature
94°; each step was 1 minute. Generally 100 pg cDNA or 50 ng genomic DNA was used in
each reaction and subjected to 40 cycles of PCR.



cDNA Fragment   Forward Oligo   Reverse Oligo   Annealing Temp   MgCl2    DMSO
290-811         290-310         790-811         64°              1.0 mM   5 %
2119-2630       2119-2139       2609-2630       64°              1.5 mM   0
2961-3550       2961-2982       3528-3550       60°              1.0 mM   0



For dye-primer sequencing the forward primers were modified to include M13 forward
sequence (ABI protocol P/N 402114, Applied Biosystems) at the 5' end of the
oligonucleotides.

Dye Primer Sequencing

Dye-primer sequencing using M13 forward and reverse primers was as described in the
ABI protocol P/N 402114 for the ABI Prism™ dye primer cycle sequencing core kit with
"AmpliTaq FS"™ DNA polymerase, modified in that the annealing temperature was 45° and
DMSO was added to the cycle sequencing mix to a final concentration of 5 %.

The extension reactions for each base were pooled, ethanol/sodium acetate
precipitated, washed and resuspended in formamide loading buffer.

4.25 % Acrylamide gels were run on an automated sequencer (ABI 377, Applied
Biosystems).

2. Results

Novel Polymorphisms

Integrin Alpha-4 cDNA
EMBL Accession No. L12002
ID HSITGA4
Ref Takada et al., EMBO J. 8: 1361-1368, 1989



Position   Published   Variant   Amino acid change   RFLP           Frequency
740        C           T                             No             1/52
2273       A           G                             eng +Acl I     32/54
2446       C           T         Thr-Met (T679M)     eng +Bsp HI    1/54
3311       T           C                             eng -Sph I     29/54
3506       C           T                             eng +Spe I     2/52

Frequency is the allele frequency of the variant allele in European control subjects.
"eng" = engineered RFLP

Example 2

Engineered restriction site for detection of polymorphisms

Standard methodology can be used to detect the polymorphisms at positions 2273,
2446, 3311 and 3506 (as defined by the position in EMBL ACCESSION NO. L12002) based
on the materials set out below using a cDNA template.



Position   Diagnostic Fragment   Forward primer      Reverse primer
2273       2119-2297             2119-2139           2274-2297 Acl I
2446       2422-2630             2422-2445 Bsp HI    2609-2630
3311       2961-3335             2961-2982           3312-3335 Sph I
3506       3481-3564             3481-3505 Spe I     3542-3564



Primer Sequences 5'-3'

2274-2297 Acl I     GGCACAAAACCTTGCAAAGTTTAA (SEQ ID NO: 1)
2422-2445 Bsp HI    ATGCTGGAGATGATGCATATGTCA (SEQ ID NO: 2)
3312-3335 Sph I     ATGATGTAGTCCTTCCAGTAGAGC (SEQ ID NO: 3)
3481-3505 Spe I     GAAGAGACAGTTGGAGTTATATCAC (SEQ ID NO: 4)



G at position 2273 creates an Acl I site in the diagnostic fragment, 2119-2297,
described above. T at position 2446 creates a Bsp HI site in the diagnostic fragment,
2422-2630, described above. T at position 3311 creates a Sph I site in the diagnostic fragment,
2961-3335, described above. T at position 3506 creates a Spe I site in the diagnostic
fragment, 3481-3564, described above.
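
The way each engineered primer positions a partial recognition sequence next to the polymorphic base can be checked with the following sketch. It assumes the standard recognition sequences for the four enzymes, assumes that the first base added to the 3' end of each primer is the polymorphic base (complemented for a reverse primer), and uses a hypothetical helper name, `remaining_site_bases`. The "remaining" bases it reports are inferred from the recognition sequences alone and have not been read from EMBL entry L12002.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Standard recognition sequences for the enzymes named in Example 2.
SITES = {"Acl I": "AACGTT", "Bsp HI": "TCATGA", "Sph I": "GCATGC", "Spe I": "ACTAGT"}

# position: (primer 5'-3', enzyme, primer strand, site-creating sense-strand base)
PRIMERS = {
    2273: ("GGCACAAAACCTTGCAAAGTTTAA", "Acl I", "reverse", "G"),
    2446: ("ATGCTGGAGATGATGCATATGTCA", "Bsp HI", "forward", "T"),
    3311: ("ATGATGTAGTCCTTCCAGTAGAGC", "Sph I", "reverse", "T"),
    3506: ("GAAGAGACAGTTGGAGTTATATCAC", "Spe I", "forward", "T"),
}


def remaining_site_bases(primer, site, sense_base, strand):
    """Return the site bases the template must still supply, or None.

    The polymorphic base (complemented for a reverse primer) is appended to
    the primer. An overlap of at least two bases between the end of the
    extended primer and the start of the site is required.
    """
    added = sense_base if strand == "forward" else COMPLEMENT[sense_base]
    extended = primer + added
    for k in range(len(site), 1, -1):
        if extended.endswith(site[:k]):
            return site[k:]
    return None


for pos, (primer, enzyme, strand, base) in PRIMERS.items():
    print(pos, enzyme, remaining_site_bases(primer, SITES[enzyme], base, strand))
```

With the alternative allele at each position the function returns None, which corresponds to the site not being formed and the fragment remaining uncut.
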



Example 3

Integrin alpha-4 promoter polymorphisms

The sequences scanned are covered by two EMBL entries, Accession nos L26059 and
M62841. The polymorphisms set out below were identified.



Position          Published   Variant   RFLP            Frequency
967 of L26059     G           A         + eng Msc I     1/54
184 of M62841     A           G         - Tsp 509 I     1/64
238 of M62841     C           T         + eng Dra I     2/62
331 of M62841     C           T         - Sma I         3/62
436 of M62841     C           T                         1/66
676 of M62841     C           T         - Bfa I         1/66
1010 of M62841    C           A                         10/60
1115 of M62841    C           T         + eng Eco RI    1/60

Frequency is the allele frequency of the variant allele in control subjects.

The following alterations in transcription factor binding sites associated with the
polymorphisms are noted.

Position of M62841   Gain of TF site                              Loss of TF site
184                  HNF-1-HP1 rev
238                  PEA CS-rev, HC1 rev, PEA3-RS rev, kappa4
436                  AP2 CS4, Pu box rev                          Histone H4 CS2 rev
1115                                                              Apa A



Example 4 

Detection of Promoter Polymorphisms using Engineered RFLPs 

Engineered RFLPs were used to detect three of the promoter polymorphisms in a PCR 
assay as set out below. 



Position        Diagnostic Fragment   Forward Primer       Reverse Primer
L26059, 967     944-1202              944-966 Msc I        1181-1202
M62841, 238     3-262                 3-32                 239-262 Dra I
M62841, 1115    1094-1267             1094-1114 Eco RI     1248-1267



Primers

944-966 Msc I       ACTTCTGAAACCCAGAGCTGGCC (SEQ ID NO: 5)
239-262 Dra I       ACCCCAACAGAGAGGTTGGTTTAA (SEQ ID NO: 6)
1094-1114 Eco RI    CCCGTTGGCCAACCGTCGAAT (SEQ ID NO: 7)



A at position 967 (L26059) creates a Msc I site in the fragment 944-1202 using the
primers described above.

T at position 238 (M62841) creates a Dra I site in the fragment 3-262 using the primers
described above.

T at position 1115 (M62841) creates an Eco RI site in the fragment 1094-1267 using
the primers described above.
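
For the Msc I and Dra I assays the recognition sequence is completed by the primer and the polymorphic base alone, so the statements above can be confirmed directly from SEQ ID NO: 5 and SEQ ID NO: 6. The sketch below assumes the standard recognition sequences (Msc I TGGCCA, Dra I TTTAAA) and that the polymorphic base is the first base added to the 3' end of the primer; the function name `creates_site` is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def creates_site(primer: str, site: str, sense_base: str, reverse: bool) -> bool:
    """True if the primer plus the first incorporated base ends with the site."""
    added = COMPLEMENT[sense_base] if reverse else sense_base
    return (primer + added).endswith(site)


MSC_I, DRA_I = "TGGCCA", "TTTAAA"     # standard recognition sequences
fwd_967 = "ACTTCTGAAACCCAGAGCTGGCC"   # SEQ ID NO: 5, forward primer
rev_238 = "ACCCCAACAGAGAGGTTGGTTTAA"  # SEQ ID NO: 6, reverse primer

print(creates_site(fwd_967, MSC_I, "A", reverse=False))  # variant allele at 967
print(creates_site(fwd_967, MSC_I, "G", reverse=False))  # published allele at 967
print(creates_site(rev_238, DRA_I, "T", reverse=True))   # variant allele at 238
print(creates_site(rev_238, DRA_I, "C", reverse=True))   # published allele at 238
```

The Eco RI primer (SEQ ID NO: 7) ends in GAAT, so a T at position 1115 gives GAATT; completing GAATTC would additionally require a C from the template, which this sketch does not check.
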


Example 5

Detection of Polymorphism using Minisequencing

Minisequencing was used to detect the promoter polymorphism at position 1010 of
M62841 as set out below.

Oligonucleotide GAAGAGGAGGGGAAGTCG (SEQ ID NO: 8) was used as a
primer in a minisequencing reaction. The C-A polymorphism is detected by the incorporation
of ddGTP or ddTTP which can be resolved, for example, by MALDI-TOF MS.
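
The relationship between the allele present and the terminator incorporated can be sketched as follows. The sketch assumes that the base immediately beyond the 3' end of the primer on the template strand is the C or A listed for position 1010, so that the incorporated dideoxynucleotide is its complement; the function name `incorporated_ddntp` is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def incorporated_ddntp(template_base: str) -> str:
    """Name the dideoxynucleotide that pairs with the template base."""
    return f"dd{COMPLEMENT[template_base.upper()]}TP"


# Published (C) and variant (A) bases at position 1010 of the promoter sequence.
for allele in ("C", "A"):
    print(allele, "->", incorporated_ddntp(allele))
```

A heterozygous sample would be expected to give both extension products, which differ in mass and can therefore be distinguished by mass spectrometry.
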






Sequence Listing Free Text 

<223> Description of Artificial Sequence:PCR primer 



